95 research outputs found

Task-Agnostic Graph Neural Network Evaluation via Adversarial Collaboration

Authors: Beaini Dominique; Liò Pietro; Stärk Hannes; Zhao Xiangyu; Zhao Yiren
Publication date: 26/03/2023

It has been increasingly demanding to develop reliable methods to evaluate the progress of Graph Neural Network (GNN) research for molecular representation learning. Existing GNN benchmarking methods for molecular representation learning focus on comparing GNN performance on node- or graph-level classification and regression tasks on particular datasets. However, a principled, task-agnostic method for directly comparing two GNNs is lacking. Additionally, most existing self-supervised learning work relies on handcrafted data augmentations, which are difficult to apply to graphs because of their unique characteristics. To address these issues, we propose GraphAC (Graph Adversarial Collaboration), a conceptually novel, principled, task-agnostic, and stable framework for evaluating GNNs through contrastive self-supervision. We introduce a novel objective function, the Competitive Barlow Twins, that allows two GNNs to jointly update themselves through direct competition against each other. GraphAC succeeds in distinguishing GNNs of different expressiveness across various aspects, and has been demonstrated to be a principled and reliable GNN evaluation method that does not require any augmentations.

Comment: 11th International Conference on Learning Representations (ICLR 2023) Machine Learning for Drug Discovery (MLDD) Workshop. 17 pages, 6 figures, 4 tables

arXiv.org e-Print Archive
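
The GraphAC abstract above names a "Competitive Barlow Twins" objective but does not define it. As a rough point of reference only, the sketch below computes the standard (non-competitive) Barlow Twins loss between two batches of embeddings: it pushes the cross-correlation matrix of the two batches towards the identity. The function names, the `lam` weight and the idea of feeding it the embeddings two encoders produce for the same inputs are assumptions made for illustration; how GraphAC turns this into a competition between two GNNs is not attempted here.

```python
import math


def normalise_columns(z):
    """Give every embedding dimension zero mean and unit std over the batch."""
    n, d = len(z), len(z[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in z]
        mean = sum(col) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        std = std if std > 0 else 1.0  # guard against constant dimensions
        for i in range(n):
            out[i][j] = (z[i][j] - mean) / std
    return out


def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Standard Barlow Twins loss for two (batch x dim) lists of embeddings.

    This is the plain objective, not the competitive variant from the paper.
    """
    n, d = len(z_a), len(z_a[0])
    a, b = normalise_columns(z_a), normalise_columns(z_b)
    on_diag, off_diag = 0.0, 0.0
    for i in range(d):
        for j in range(d):
            # Entry (i, j) of the cross-correlation matrix between the batches.
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            if i == j:
                on_diag += (1.0 - c_ij) ** 2  # invariance term
            else:
                off_diag += c_ij ** 2  # redundancy-reduction term
    return on_diag + lam * off_diag
```

With identical, already decorrelated embeddings the cross-correlation matrix is the identity and the loss is zero; embeddings that disagree in sign on every dimension are penalised through the diagonal term.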

Augmentation Backdoors

Authors: Mullins Robert; Rance Joseph; Shumailov Ilia; Zhao Yiren
Publication date: 29/09/2022

Data augmentation is used extensively to improve model generalisation. However, reliance on external libraries to implement augmentation methods introduces a vulnerability into the machine learning pipeline. It is well known that backdoors can be inserted into machine learning models by serving a modified dataset to train on. Augmentation therefore presents a perfect opportunity to perform this modification without requiring an initially backdoored dataset. In this paper we present three backdoor attacks that can be covertly inserted into data augmentation. Our attacks each insert a backdoor using a different type of computer vision augmentation transform, covering simple image transforms, GAN-based augmentation, and composition-based augmentation. By inserting the backdoor using these augmentation transforms, we make our backdoors difficult to detect, while still supporting arbitrary backdoor functionality. We evaluate our attacks on a range of computer vision benchmarks and demonstrate that an attacker is able to introduce backdoors through just a malicious augmentation routine.

Comment: 12 pages, 8 figures

arXiv.org e-Print Archive

To compress or not to compress: Understanding the Interactions between Adversarial Attacks and Neural Network Compression

Authors: Anderson Ross; Mullins Robert; Shumailov Ilia; Zhao Yiren
Publication date: 31/03/2019

As deep neural networks (DNNs) become widely used, pruned and quantised models are becoming ubiquitous on edge devices; such compressed DNNs are popular for lowering computational requirements. Meanwhile, recent studies show that adversarial samples can be effective at making DNNs misclassify. We therefore investigate the extent to which adversarial samples are transferable between uncompressed and compressed DNNs. We find that adversarial samples remain transferable for both pruned and quantised models. For pruning, the adversarial samples generated from heavily pruned models remain effective on uncompressed models. For quantisation, we find the transferability of adversarial samples is highly sensitive to integer precision.

Partially supported with funds from Bosch-Forschungsstiftung im Stifterverband

arXiv.org e-Print Archive

Edinburgh Research Explorer

Apollo (Cambridge)
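
The abstract above refers to two compression operations, pruning and quantisation, without saying which variants were used. The sketch below shows one common form of each, applied to a flat list of weights: magnitude pruning, which zeroes the smallest weights, and symmetric uniform quantisation, which snaps weights to a signed integer grid of a given bit width. Both function names and the specific schemes are assumptions chosen for illustration; they are not taken from the paper, and the adversarial transfer experiments themselves are not reproduced.

```python
def magnitude_prune(weights, sparsity):
    """Zero the fraction `sparsity` of weights with the smallest magnitude."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices sorted by absolute value; the first k are dropped.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantise_uniform(weights, bits):
    """Round weights to a symmetric signed `bits`-bit grid, then rescale.

    Assumes bits >= 2 so that at least one non-zero level exists.
    """
    max_abs = max(abs(w) for w in weights)
    if max_abs == 0:
        return list(weights)
    levels = 2 ** (bits - 1) - 1  # largest representable integer
    scale = max_abs / levels
    return [round(w / scale) * scale for w in weights]
```

Lowering `bits` shrinks the set of values a weight can take, which is the knob the abstract describes as integer precision.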

Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs

Authors: Li Zehui; Liò Pietro; Shen Mingzhu; Stan Guy-Bart; Zhao Xiangyu; Zhao Yiren
Publication date: 08/06/2023

Graphs are widely used to encapsulate a variety of data formats, but real-world networks often involve complex node relations that go beyond pairwise connections. While hypergraphs and hierarchical graphs have been developed and employed to account for these complex node relations, they cannot fully represent them in practice. Additionally, though many Graph Neural Networks (GNNs) have been proposed for representation learning on higher-order graphs, they are usually only evaluated on simple graph datasets. Therefore, there is a need for a unified modelling of higher-order graphs, and for a collection of comprehensive datasets with an accessible evaluation framework, to fully understand the performance of these algorithms on complex graphs. In this paper, we introduce the concept of hybrid graphs, a unified definition for higher-order graphs, and present the Hybrid Graph Benchmark (HGB). HGB contains 23 real-world hybrid graph datasets across various domains such as biology, social media, and e-commerce. Furthermore, we provide an extensible evaluation framework and a supporting codebase to facilitate the training and evaluation of GNNs on HGB. Our empirical study of existing GNNs on HGB reveals various research opportunities and gaps, including (1) evaluating the actual performance improvement of hypergraph GNNs over simple graph GNNs; (2) comparing the impact of different sampling strategies on hybrid graph learning methods; and (3) exploring ways to integrate simple graph and hypergraph information. We make our source code and full datasets publicly available at https://zehui127.github.io/hybrid-graph-benchmark/.

Comment: Preprint. Under review. 16 pages, 5 figures, 11 tables

arXiv.org e-Print Archive

Revisiting Automated Prompting: Are We Actually Doing Better?

Authors: Gal Yarin; Mullins Robert; Shumailov Ilia; Zhao Yiren; Zhou Yulin
Publication date: 22/06/2023

Current literature demonstrates that Large Language Models (LLMs) are great few-shot learners, and that prompting significantly increases their performance on a range of downstream tasks in a few-shot learning setting. An attempt to automate human-led prompting followed, with some progress achieved. In particular, subsequent work demonstrates that automation can outperform fine-tuning in certain K-shot learning scenarios. In this paper, we revisit techniques for automated prompting on six different downstream tasks and a larger range of K-shot learning settings. We find that automated prompting does not consistently outperform simple manual prompts. Our work suggests that, in addition to fine-tuning, manual prompts should be used as a baseline in this line of research.

arXiv.org e-Print Archive

Wide Attention Is The Way Forward For Transformers?

Authors: Brown Jason Ross; Mullins Robert D; Shumailov Ilia; Zhao Yiren
Publication date: 08/11/2022

The Transformer is an extremely powerful and prominent deep learning architecture. In this work, we challenge the commonly held belief in deep learning that going deeper is better, and show an alternative design approach: building wider attention Transformers. We demonstrate that wide single-layer Transformer models can compete with or outperform deeper ones in a variety of Natural Language Processing (NLP) tasks when both are trained from scratch. The impact of changing the model aspect ratio on Transformers is then studied systematically. This ratio balances the number of layers and the number of attention heads per layer while keeping the total number of attention heads and all other hyperparameters constant. On average, across 4 NLP tasks and 10 attention types, single-layer wide models perform 0.3% better than their deep counterparts. We present an in-depth evaluation and demonstrate that wide models require a far smaller memory footprint and can run faster on commodity hardware; in addition, these wider models are more interpretable. For example, a single-layer Transformer on IMDb byte-level text classification has 3.1x faster inference latency on a CPU than its equally accurate deeper counterpart, and is half the size. We therefore put forward wider and shallower models as a viable and desirable alternative for small models on NLP tasks, and as an important area of research for domains beyond this.

arXiv.org e-Print Archive
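
The aspect-ratio study described in the abstract above trades layers against attention heads per layer while the total number of heads stays fixed. A small sketch of that constraint: the hypothetical helper `head_layouts` lists every (layers, heads per layer) pair whose product equals a given head budget. The helper name and the budget of 12 are illustrative assumptions, not configurations reported in the paper.

```python
def head_layouts(total_heads):
    """All (layers, heads_per_layer) pairs with layers * heads_per_layer == total_heads."""
    return [
        (layers, total_heads // layers)
        for layers in range(1, total_heads + 1)
        if total_heads % layers == 0
    ]


# A budget of 12 heads ranges from one wide layer to twelve single-head layers.
for layers, heads in head_layouts(12):
    print(f"{layers} layer(s) x {heads} head(s)")
```

The first pair in the list is the single-layer wide model and the last is the deepest, narrowest one; everything in between is an intermediate aspect ratio under the same head budget.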